


NATIONAL SECURITY AGENCY 



MECHANIZATION IN SUPPORT 
OF COMINT 



PHASE I 



WARNING 



THIS DOCUMENT CONTAINS CODEWORD MATERIAL 



TOP SECRET CONTROL NUMBER 301 51 4 1



Declassified and approved for release by NSA on 08-16-2013 pursuant to E.O. 13526





NATIONAL SECURITY AGENCY
Washington 25, D. C.

MECHANIZATION IN SUPPORT OF COMINT
Phase I

Compiled by
R/D Personnel



October 1954 



NSA Form 781-C10S 1 Jul 52




PL 86-36/50 USC 3605
EO 3.3(h)(2)



TABLE OF CONTENTS

                                                          Page

Introduction . . . 1

PHASE I

I. Comint Collection Activities . . . 3

   (A) Functional Categories of Intercept Problems . . . 3

       3. Government Communications Service
       4. Military Tactical
       5. Military Strategic
       6. Support Communications

II. Operations on Traffic Independent of System . . . 10

   (A) Preliminary Processing . . . 10

       1. Logging and Editing Problems . . . 11
       2. Data Conversion Problems . . . 12
       3. Data Storage and Recovery Problems . . . 12

   (B) Traffic Analysis . . . 13

   (C) Textual Analysis . . . 14

III. Diagnostic Operations . . . 15

   (A) Search for and Statistical Evaluation of
       Phenomena . . . 15

       1. Identity Problems . . . 16
       2. Latent Problems . . . 17

   (B) Test of Specific Hypotheses . . . 18

       1. Machine Systems . . . 18
       2. Hand Systems

IV. Operations Based on Knowledge of the
    General System

   (A) Machine Systems . . . 20

       1. Analysis of Depths
          a. Depth Search
          b. Depth Testing
          c. Depth Reading
       2. Machine Recovery and Reading
       3. Decryption

   (B) Hand Systems . . . 25

       1. Additive Encipherment
       2. Exploitation of Statistical
          Phenomena
       3. Additional Complex Procedures . . . 28

V. Support Functions

   (A) Linguistic and Statistical Aids . . . 30
   (B) Generation of Crypto-system Data . . . 30
   (C) Desk Aids . . . 31
   (D) Cryptanalytic Research . . . 31
   (E) Collateral . . . 31

TOP SECRET FROTH



INTRODUCTION 

This study is an evaluation of the analytical machine
and intercept equipment phases of the present Research and
Development (NSA-30) program in light of the present and
projected problems of Communications Intelligence. The
conclusions drawn from this evaluation should result in
a Research and Development program consistent with Comint
mechanization and traffic requirements.

The study has four phases. The following paragraphs
serve as an introduction to the first two phases only.
The present objectives of phases III and IV are included,
but may change considerably as phases I and II mature.

Phase I attempts to uncover the areas where mechanization
is, or could be, of use to the Comint effort. In
phase I no attempt is made to evaluate previous or present
efforts at mechanization in these areas.

Phase II considers only the areas uncovered in phase
I, and attempts to give a quantitative estimate of the
success of the present effort in each of these areas and
thus reveal the shortcomings in the present R and D
program.

Phase III will consider each of the shortcomings
discussed in phase II, and will list possible procedures,
techniques, etc. which can be brought to bear on these
areas.

Phase IV will analyze the relative merits of the
procedures, techniques, etc. set forth in phase III and
synthesize the results to formulate an R/D program which
will satisfy COMINT requirements, be within technological
limitations, and be consistent with agency and government
policy.

The collection of the information contained in the
first two phases of this study was an effort directed by
R/D personnel. Key personnel from Production have willingly
and ably assisted in the actual writing of this report
and in offering pertinent information through discussions
and conferences where and when needed. While PROD personnel
assisted in the formation of this report, the interpretations
contained herein are those formulated by R/D
personnel. It is R/D's hope that these interpretations
conform with those of the other Offices.

That there are many errors of omission and commission
in this report should be "a priori" knowledge. The R/D
Office will be most grateful to persons bringing these
errors to its attention so that a more accurate and more
complete picture may be obtained.






I. COMINT Collection Activities 



The COMINT Collection Problem divides into two phases:
the interception of the various signals discussed below,
and their subsequent transmission to NSA Headquarters and
distribution to the interested analysts.

Sometimes analysts require information about a signal
over and above that provided by a complete transcription
thereof. For example, it may be required that an
indication of the quality of interception of enciphered
traffic be supplied with the traffic on a character-by-character
or a signal element-by-signal element basis.
Another requirement is that of providing the analyst a
record of time intervals between successively transmitted
characters. Still another requirement is the
inclusion of information which will serve as a means of
providing transmitter identification and in some cases the
identification of individual operators.

These requirements apply generally to several of the
intercept situations.

(A) Functional Categories of Intercept Problems 

The various transmissions of interest to NSA may
be broken down into functional categories. These functional
categories provide certain degrees of information about
the characteristics of the transmissions and thus some
information regarding the intercept problem. Internal
networks often create substantial problems in obtaining
good intercept sites.




3. Government Communications Services

This classification is used here for internal
networks of various government organizations such as
border guards, secret police, police, agent transmissions,
and the like. These nets are generally widespread, have
low traffic densities, and irregular schedules.






4. Military Tactical

Military Tactical traffic (sometimes referred
to as low-level traffic) is usually intercepted
and often processed, to some extent, by the supporting
service COMINT organizations. This material is also
forwarded to NSA. This traffic is usually low powered,
and transmitted by simple communication systems with
irregular schedules.



5. Military Strategic

This category consists of high-level
military communications up to the very highest echelon.
High traffic densities and regular schedules characterize
these transmissions.



6. Support Communications

This category comprises those communication
services used to support various types of operations.
These may be either broadcast or point-to-point transmissions.
Weather nets and broadcasts, and navigational
aids and services are examples of support communications.





II. Operations on Traffic Independent of System

(A) Preliminary Processing

Many operations must be performed on traffic
arriving at NSA prior to its distribution to the users.
These operations fall under the general headings of
(1) Logging and Editing, (2) Data Conversion and (3) Data
Storage and Recovery. The operations performed during
these preliminary processing stages are independent of
the crypto-system (if any) employed on the traffic by
the sender prior to transmission. The operations
performed on the intercepted traffic, and the final forms
and numbers of copies required, are governed by the needs
and desires of the analysts, machines, etc., which must
further process and analyze the traffic. The problems
in this area which appear to be amenable to mechanization
are as follows:

1. Logging and Editing Problems

Generally these problems encompass the
optimization and mechanization of the logging and editing
functions which involve the logging, processing,
division of page copy into identifiable systems, and
the deletion of uninformative material.

Specifically the problems involve the keeping
of a record of all traffic and the sections of
analysts which are responsible, the sorting of messages
in an order pre-determined by a format, and recording
the messages in a log book by hand or by some machine
process. Duplicate messages are noted at this point and
messages are given worksheet numbers.

The editing problems include the deletion
of uninformative material, reordering of information,
discrimination between textual and non-textual groups,
correction of group length and run-together groups,
identification of duplicate messages, etc.
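
The logging steps described above (recording each message, assigning a worksheet number, and noting duplicates) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `LogBook`, the method `log`, and the rule that two messages are duplicates when they agree after spacing and case are normalized are all hypothetical and do not come from this report.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class LogBook:
    """Toy log book (hypothetical): numbers each message and notes duplicates."""
    entries: List[Tuple[int, Optional[int], str]] = field(default_factory=list)
    first_seen: Dict[str, int] = field(default_factory=dict)

    def log(self, text: str) -> Tuple[int, Optional[int]]:
        # Assumption: collapse spacing and case so trivially different copies compare equal.
        key = " ".join(text.upper().split())
        worksheet_number = len(self.entries) + 1
        duplicate_of = self.first_seen.get(key)
        if duplicate_of is None:
            self.first_seen[key] = worksheet_number
        self.entries.append((worksheet_number, duplicate_of, text))
        # Returns the new worksheet number and, if any, the number of the earlier copy.
        return worksheet_number, duplicate_of


book = LogBook()
for message in ["12345 67890 11223", "44556 77889", "12345  67890 11223 "]:
    print(book.log(message))
```

The third message differs from the first only in spacing, so it receives its own worksheet number but is flagged as a duplicate of worksheet 1.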

2. Data Conversion Problems 

Traffic arrives at NSA in as many as four
different forms: hard copy, perforated paper tape, magnetic
tape, and occasionally, punched cards. Multiple
copies or copies from which unwanted material is deleted
may be required in more than one of these forms. Because
of this the general problems of this area center
about devising procedures and effecting mechanization
for converting the original traffic into the various
desired forms.

The specific problems involve the preparation
of hard copy where it does not exist, and the preparation
of edited versions of these data in punched
cards, perforated paper tape, and magnetic tape as required
as inputs to the various analytic machines.

3. Data Storage and Recovery Problems

The general problem here involves the
selection of permanent media on which to reproduce all
hard copy received, the storing of information of use to
the intelligence analyst, and the maintenance of files
containing all copy handled.

Particular problems are: the reproduction
on a permanent medium, such as microfilm, and the numbering
in the order of arrival of all hard copy traffic;
the storage and filing of collateral information of use
to the intelligence analysts; keeping the files current
by continual addition of both new subjects and new
information under old subjects; and the immediate recovery
of material from storage or files as required.

(B) Traffic Analysis

Traffic Analysis (T/A) is that area of NSA
primarily concerned with the problems of obtaining
communications intelligence from non-textual information.
A problem of this area is the reconstruction of
communication networks which carry communication intelligence
of interest. A second problem involves the
attempt to determine the "order of battle" of the
communicator, i.e., the location and strength of the
communicator's armed forces and strategic material and
the identification and location of key personnel.

The specific problems are those of performing
the sorting and indexing processes and other clerical
operations of traffic analysis. Large volumes of
traffic must be scanned to uncover frequencies, call signs,
message serial numbers, transmittal dates and times, page
and pad numbers, addresses, and operator mistakes. After
studies are made of these results, communication nets are
reconstructed or modified.

One technique which is applicable in certain situations
to network reconstruction is the recovery, by cryptanalytic
processes, of call signs (transmitting station
identification) which have been enciphered. These cryptanalytic
processes are similar to some extent to those
which are applicable to textual analysis.

(C) Textual Analysis 

Textual analysis deals with the problems involved
in the selection of messages of potential intelligence
value from the large volume of plain text and plain code
messages handled by NSA. In plain text messages the
communicators make no attempt to conceal the information
content. Thus the problem is one of scanning messages
and selecting those messages which contain certain elements
as messages of intelligence value.

In plain code messages an unenciphered code is
employed for privacy and brevity. Here the problem is
essentially that encountered in the scanning of plain
text, with some complicating and some simplifying factors.

The particular problems in this area are: to
scan rapidly an extremely large volume of plain text messages
to see whether they contain certain words, addresses,
signatures, etc.; to print out the plain text messages
containing the elements sought for; and to categorize the
selected messages by subject matter using words, addresses,
etc. as the criteria.
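
The scanning and categorizing operations just listed can be illustrated with a short Python sketch. The function name `categorize`, the subject labels, and the sample messages are hypothetical and chosen only for illustration; a message is assumed to match a subject when it contains at least one of that subject's words as a whole word.

```python
import re
from typing import Dict, Iterable, List, Set


def categorize(messages: Iterable[str], criteria: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return, for each subject label, the messages containing any of its words."""
    selected: Dict[str, List[str]] = {label: [] for label in criteria}
    for message in messages:
        # Split the message into upper-case words so matching ignores case and punctuation.
        words = set(re.findall(r"[A-Z0-9]+", message.upper()))
        for label, terms in criteria.items():
            if words & {term.upper() for term in terms}:
                selected[label].append(message)
    return selected


traffic = [
    "SHIPMENT OF FUEL ARRIVES TUESDAY",
    "WEATHER CLEAR",
    "fuel depot closed; send trucks",
]
watch = {"supply": {"fuel", "trucks"}, "weather": {"weather"}}
for label, hits in categorize(traffic, watch).items():
    print(label, hits)
```

A message may fall under more than one subject, and messages matching no criterion are simply not selected.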

The problems listed above are quite similar to 
some discussed previously. 

For plain code messages the same problems exist 
except that the problems are simpler in that code groups 
are usually uniform in length, but complicated by the fact 
that many codes can appear over a communications link. 

A problem unique with plain code messages is the
automatic decoding and printing, when the code book is
known.

A side problem is one of determining whether a
secret code or a known commercial code has been employed.

III. Diagnostic Operations

(A) Search for and Statistical Evaluation of
Phenomena

Where messages appear which are neither plain
text nor in some readily readable enciphering system, we
must, in the absence of outside information, analyze the
text of the messages in order to determine the cryptosystem
involved. The analysis may be of the texts themselves
(Identity) or upon derivative forms of the texts (Latent).
The problems of this area encompass the procedures in and
the mechanization of: (1) the operations by means of which
original text is converted to the desired latent form,
(2) the diagnostic operations involved in the search for
and statistical evaluation of phenomena which arise in the
original or derived text.

We list the problems to be considered under the
two headings Identity and Latent according to which form
of text is being considered.

(1) Identity Problems

Here a problem is one of searching for the
exploitable intrinsic characteristics of plain language and
its encipherments. Examples are the widely varying frequencies
at which the individual letters occur in literal
text and the cohesion of plain language as exhibited by
the rough frequency distribution of pairs of letters
(digraphs), triples of letters (trigraphs), etc. Under encipherment
the intrinsic characteristics of plain language
are partially or totally destroyed. However, new characteristics
may become prominent and these may be employed
in the characterization of the enciphering system. An
example is the Enigma machine in which no letter may be enciphered
into itself.

Once a type of message having particular
characteristics is identified, a problem is to collect all
messages of that type so that the benefits of a large statistical
population may be employed.
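
The frequency counts of single letters, digraphs and trigraphs mentioned above can be tallied as in the following Python sketch. The function name `ngram_counts` is hypothetical, and the sketch assumes that only the letters A to Z are counted and that spacing and punctuation are ignored.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count overlapping runs of n letters (n=1 letters, n=2 digraphs, n=3 trigraphs)."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    return Counter("".join(letters[i:i + n]) for i in range(len(letters) - n + 1))


sample = "THE WEATHER IS THE SAME AS THE DAY BEFORE"
print(ngram_counts(sample, 1).most_common(3))
print(ngram_counts(sample, 2).most_common(3))
```

Pooling the counts from many messages of the same type is what gives the large statistical population referred to in the text.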

(2) Latent Problems

Sometimes a characteristic of cipher text
is found, not in the text as sent, but in something derived
from the text. For example, we can assign the numbers from
1 to 26 to the letters from A to Z. A will be considered
to follow Z, as in a cyclic arrangement. By the use of this
we introduce a measure of distance between letters of the
alphabet. We sometimes "difference" the letters in a message
by replacing each letter by its assigned number and
then subtracting each assigned number from its predecessor
in the message. In some cases the distance from a given
letter to its next occurrence is used to replace the letter.
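
The two derived ("Latent") forms just described can be sketched in Python as follows. The function names are hypothetical. Two conventions are assumptions of this sketch rather than statements of the report: differences are reduced modulo 26 into the range 0 to 25, and a letter with no later occurrence is represented by `None`.

```python
from typing import List, Optional


def to_numbers(text: str) -> List[int]:
    """Assign 1..26 to A..Z, ignoring everything else."""
    return [ord(c) - ord("A") + 1 for c in text.upper() if "A" <= c <= "Z"]


def difference(text: str) -> List[int]:
    """Subtract each letter's number from its predecessor's, cyclically (mod 26)."""
    nums = to_numbers(text)
    return [(prev - cur) % 26 for prev, cur in zip(nums, nums[1:])]


def next_occurrence_distances(text: str) -> List[Optional[int]]:
    """Replace each letter by the distance to its next occurrence in the message."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    distances: List[Optional[int]] = [None] * len(letters)
    later_position = {}
    # Walk backwards so the most recently recorded position is the next occurrence.
    for i in range(len(letters) - 1, -1, -1):
        if letters[i] in later_position:
            distances[i] = later_position[letters[i]] - i
        later_position[letters[i]] = i
    return distances


print(difference("ABD"))
print(next_occurrence_distances("ABAB"))
```

A differenced message is one element shorter than the original, since the first letter has no predecessor.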

The problems met in the processing and an- 
alysis of Latent text are essentially those of Identity text. 
However, the problem of transforming the original text into 
the desired Latent form and the preparation of the Latent 
data for machine processing exists and is not fully includ- 
ed in the problems of Identity. 





(B) Test of Specific Hypotheses

On hypothesizing the enciphering process
employed in the production of the messages under analysis,
more specific tests may be employed to determine the validity
of the original hypothesis. The hypotheses fall
into two general classes: (1) encipherment was performed
by a machine system, (2) encipherment was performed by a
non-machine (hand) system (unenciphered and enciphered
codes are included here).

Each encipherment system produces certain
distinct and observable characteristics which are discernible
message-wise or over a selected group of messages.
The problems here concern the mechanization of the tests,
operations, etc. employed in searching for and establishing
the existence of these observable characteristics.

1. Machine Systems

Certain machine systems produce cipher
text having marked characteristics. For example, the
Enigma type machines cannot encipher a letter into itself.
This phenomenon tends to make Enigma cipher text non-random,
in the sense that high frequency plain letters tend
to occur less frequently in cipher text.

The specific problems here involve the
processing of varying amounts of text to ascertain whether
characteristics are present. The process usually includes
statistical computations.

2. Hand Systems

Whenever it is believed that a new unenciphered
code is in use, messages employing this code must be separated
from others sent over the same communications link.
This can be done by comparing all groups within the message
against themselves for coincidences to obtain the internal
"cipher"-group repetitions. A problem here is to perform
the coincidence tests and obtain the internal "cipher"-group
repetitions.
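
The coincidence test described above, in which every group of a message is compared against every other group of the same message, can be sketched in Python. The function names are hypothetical, and the sketch assumes that groups are separated by spaces. Counting occurrences of each group gives the same result as comparing all pairs directly.

```python
from collections import Counter
from typing import Dict


def internal_repetitions(message: str) -> Dict[str, int]:
    """Groups occurring more than once in the message, with their counts."""
    counts = Counter(message.split())
    return {group: n for group, n in counts.items() if n > 1}


def coincidence_count(message: str) -> int:
    """Number of coinciding pairs when every group is compared with every other."""
    counts = Counter(message.split())
    # A group seen n times contributes n*(n-1)/2 matching pairs.
    return sum(n * (n - 1) // 2 for n in counts.values())


text = "1234 5678 1234 9012 5678 1234"
print(internal_repetitions(text))
print(coincidence_count(text))
```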

When code structures are partially or fully
known, all code groups in a message may be compared with the
known code groups and scored statistically. The problem
here is to perform the comparisons and do the statistical
scoring.

When codes are enciphered by means of an
additive pad, reuses may occur on a different communications
link at a later time. If an additive pad has been recovered
from the first use, other messages may be deciphered
with the recovered key and the resulting texts inspected
as in the first paragraph to see whether they look like
either known or unknown unenciphered code.

Some codes have a "built in" checking feature
which allows the message receiver to check his text as to
garbles, lost portions of code groups, etc. One such checking
feature is that the last digit of the code group is the
sum of the other digits of the code group. Such
characteristics can be utilized to identify codes. The
problem here is one of finding and exploiting these checking
features.
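
As an illustration of how such a checking feature might be used to identify a code, the Python sketch below measures how often the groups of a message satisfy the sum check. The function names are hypothetical, and the sketch assumes that the sum is taken modulo 10, which the text does not state. Under that assumption, groups of unrelated random digits would pass about one time in ten, so a rate near 1.0 suggests a code with this feature.

```python
from typing import Iterable


def passes_sum_check(group: str) -> bool:
    """True if the last digit equals the sum of the other digits (assumed mod 10)."""
    digits = [int(d) for d in group]
    return sum(digits[:-1]) % 10 == digits[-1]


def check_rate(groups: Iterable[str]) -> float:
    """Fraction of all-digit groups (length 2 or more) that satisfy the sum check."""
    usable = [g for g in groups if g.isdigit() and len(g) > 1]
    if not usable:
        return 0.0
    return sum(passes_sum_check(g) for g in usable) / len(usable)


print(check_rate("12339 11114 00000 98765".split()))
```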

IV. Operations Based on Knowledge of the General System

Encipherment of text is performed by means of a
machine or hand system. In some cases we are able to
assume with a high degree of certainty that a particular
message or set of messages was enciphered (encoded) by
means of a specific machine or hand system. If we possess
statistical, partial or complete knowledge of the variable
elements of the system involved, then certain statistical,
exhaustive, or analog attacks may be made to recover the
messages and some of the remaining unknowns of the cryptosystem.

For convenience we list the problems under two headings:
(1) machine systems and (2) hand systems. However,
a great deal of overlap exists as will be seen below.

(A) Machine Systems





2. Machine Recovery and Setting

Cipher machines are usually set up on two
bases. The first is an arrangement of the internal elements
of the machine which may be kept unchanged for a relatively
long period (days, months or even years). This arrangement
is called the internal setting. For example, the
wiring of the rotors, the order of rotors, the plugged
substitution between the keyboard and the input side of a
rotor maze of the Enigma machine, and the pins on the
Hagelin wheels fall into this class.

In addition, there are usually means for starting
the machine at some point in its cycle by setting certain
marks (usually letters or numbers) on the rotors opposite
their respective bench marks. This is called the
message setting.

The internal and message settings are considered
as the variables of the machine. The problem here is to
recover the internal setting (machine recovery) and the
message setting (setting recovery).




tively, 





A problem which involves a combination of logical
and exhaustive trial techniques follows. Let us assume
that an Enigma machine enciphered the message. Let us
assume also that we are given a crib (a length of plain
text corresponding to a certain stretch of cipher text) free
of garbles. In addition, certain of the internal variables
are assumed. This narrows the field covered by exhaustive
trials to the remaining unknown variables. We also require
an analogue to the cipher machine which supplies us
with the Enigma motion and its internal elements. Then by
assuming one or more features of each of the remaining unknown
variables, we employ the crib and machine analogue
to derive a set of implications. If the implications are
contradictory, then our original assumptions were incorrect.
We run through the set of possible assumptions and keep only
those which lead to non-contradictory statements and which
pass a certain predetermined threshold.
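
The pattern of the attack above (an analogue of the machine, a garble-free crib, and exhaustive trial of the remaining unknown with rejection on contradiction) can be illustrated on a deliberately simplified toy. The sketch below is not an Enigma and is not the procedure of this report: it assumes a hypothetical one-rotor machine that steps once per letter, whose wiring is taken as known (the internal variable assumed) and whose starting position is the only unknown. The wiring string is used purely as an example permutation, and all function names are hypothetical.

```python
import string
from typing import List

ALPHABET = string.ascii_uppercase
# Assumed-known internal setting of the toy machine: one fixed rotor wiring.
WIRING = "EKMFLGDQVZNTOWYHXUSPAIBRCJ"


def encipher_letter(letter: str, shift: int) -> str:
    """Pass one letter through the rotor when it is displaced by `shift` steps."""
    entered = (ALPHABET.index(letter) + shift) % 26
    return ALPHABET[(ALPHABET.index(WIRING[entered]) - shift) % 26]


def encipher(plain: str, start: int) -> str:
    """Machine analogue: the rotor advances one step after every letter."""
    return "".join(encipher_letter(ch, (start + i) % 26) for i, ch in enumerate(plain))


def consistent_starts(crib: str, cipher: str) -> List[int]:
    """Try every starting position; keep those whose implications never contradict the cipher."""
    survivors = []
    for start in range(26):
        # all() stops at the first contradiction between implied and observed cipher letter.
        if all(encipher_letter(p, (start + i) % 26) == c
               for i, (p, c) in enumerate(zip(crib, cipher))):
            survivors.append(start)
    return survivors


crib = "WETTERBERICHT"
observed = encipher(crib, 7)
print(consistent_starts(crib, observed))
```

With a garble-free crib the threshold is simply complete agreement; with garbles one would instead keep assumptions whose number of agreements exceeds some chosen threshold.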


The number of problems involved in the attacks
on machine systems is too large to list here. These problems
are listed in Appendix A to Phase I.

3. Decryption

At times, a cryptographic period is completely
broken, i.e., all the daily settings are recovered, and the
message settings are recoverable as part of the message.
We are then in the same situation as the intended addressee.
Hence the problem is to decrypt the traffic as it becomes
available and print the resulting plain text. However,
the high deterioration rate of intelligence makes it necessary
to perform this decryption and printing as rapidly as
possible.

(B) Hand Systems

1. Additive Encipherment

Successful recovery of additively enciphered 
messages depends on the existence of 



Sometimes the key is predictable 
because the text of some known book (like a table of logar- 
ithms or a novel) is used as though it were a sequence of 
symbols. At other times, the symbol sequence is generated 
by a typist striking keys in what he believes to be purely 
random fashion. The problem is to provide the analyst 
with pairs of groups, one of which has a probability of 
occurring as a fragment of plain text and the other of which 




has a probability of being produced as typewriter random,
and paired in such a way that if both had occurred, they
would have produced a group of the actual cipher text.
It is desirable that with the pairs of groups, the product
of their probabilities of occurrence be also presented.
Again, the sequence may be generated by a machine built for
the purpose. Once a large enough sample is obtained, the
problem is to construct an analogue of the machine so that
we have a means of generating key similar to the key which
will be used. This last problem is exactly like the problem
of machine recovery.
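
The pairing of likely plain groups with likely key groups described above can be sketched in Python. Everything specific here is an assumption made for illustration: groups are taken to be strings of digits, encipherment is taken to be digit-by-digit addition modulo 10 without carrying, and the probability tables and function names are hypothetical.

```python
from typing import Dict, List, Tuple


def subtract_mod10(cipher: str, plain: str) -> str:
    """Digit-by-digit subtraction without carrying (assumed additive arithmetic)."""
    return "".join(str((int(c) - int(p)) % 10) for c, p in zip(cipher, plain))


def candidate_pairs(cipher_group: str,
                    plain_probs: Dict[str, float],
                    key_probs: Dict[str, float]) -> List[Tuple[str, str, float]]:
    """List (plain, key, product of probabilities) for pairs that yield the cipher group."""
    pairs = []
    for plain, p_plain in plain_probs.items():
        key = subtract_mod10(cipher_group, plain)
        p_key = key_probs.get(key)
        if p_key is not None:
            pairs.append((plain, key, p_plain * p_key))
    # Present the most probable pairs first.
    return sorted(pairs, key=lambda item: item[2], reverse=True)


plain_table = {"1234": 0.02, "4680": 0.01, "0000": 0.05}   # hypothetical values
key_table = {"4567": 0.003, "1111": 0.01}                   # hypothetical values
for row in candidate_pairs("5791", plain_table, key_table):
    print(row)
```

Plain groups whose implied key does not appear in the key table are dropped, which corresponds to treating that key as having negligible probability.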

The problems of additive analysis fall into
four classes.

(1a) The first class of problems deals with system
discrimination, which tries to identify the elements making
up the enciphering process. Both the steps in the enciphering
process and the enciphering elements are sought.

(1b) The second class of problems are those of indicator
recovery, or identifying and exploiting the decipherment
information sent in the message such as pad number and
starting point.

(1c) The third class of problems encompasses the
search for depths, or the discovery of messages enciphered
by the same pad. The search for isologs (messages having the
same plain text but enciphered by different pads made up of
exploitable additive) is included here.

(1d) The fourth class of problems are those of exploitation,
or the actual recovery of the plain text or
plain code using methods previously discussed.

2. Exploitation of Statistical Phenomena

The case of carelessly generated key which
was discussed above is subject to attack. In the case of
the typist discussed above, we can generate key similar to
that which he produces. We then try it out at various positions
in a message and examine the resulting decryption to
see if it looks like plain text. A unique problem here is
the generation of likely key. The problems of placement of
key in the message, the decryption of the message employing
this key, and the recognition of "good" plain text have
been encountered above.

In another case we have a different phenomenon.
Studying previously recovered key of the system, we
notice that not all key values occur with equal frequency.
In this case we say that the distribution of key values is
statistically rough. Furthermore, we may notice other
statistical properties which more exactly describe the
distribution than the mere statement of its roughness.
Let us say that some central station using this system
sends the same message to a number of its outlying stations,
but enciphers this message with a different pad for each
transmission. The practical reason for this is that recipient
station A does not have the same one-time pads as station
B or C, nor do B and C hold like pads, because a one-time
pad is usually made in only two copies, one for each
of a pair of correspondents.

Now we have a sort of reverse "depth", a set
of cipher messages in which the texts are identical but the
key for each is different. This set of messages can be broken
in the sense that the single plain text and all keys can be
recovered. To effect the solution we assume some plain text
and derive key. For each message the correct assumption
should produce key which has the previously noted statistical
properties. In fact, good assumptions in various portions
of the text would be consistent and yield better information
about the statistical nature of key population which in turn
is used to improve or direct assumptions in other portions
of the text.

The problems here include the placement of
plain text in the alignment of messages, the deciphering process
to obtain the underlying key, the statistical tests made
on the key, and the identification of good results.
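
The steps named above (place assumed plain text against the aligned messages, derive the key each message would then have used, and test that key statistically) can be sketched in Python. The sketch rests on assumptions not stated in the report: text and key are digits combined by non-carrying addition modulo 10, the rough key distribution is known as a table of digit probabilities, and the statistical test is a simple sum of log probabilities. All names and values are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def derive_key(cipher: str, plain: str) -> List[int]:
    """Key digits implied by an assumed plain text (cipher minus plain, mod 10)."""
    return [(int(c) - int(p)) % 10 for c, p in zip(cipher, plain)]


def score_assumption(ciphers: Sequence[str], plain: str, position: int,
                     key_probs: Dict[int, float]) -> float:
    """Sum of log probabilities of the derived key over all messages of the set."""
    total = 0.0
    for cipher in ciphers:
        segment = cipher[position:position + len(plain)]
        total += sum(math.log(key_probs[k]) for k in derive_key(segment, plain))
    return total


# Hypothetical rough key distribution: digits 0 and 1 are heavily favoured.
rough = {d: (0.3 if d in (0, 1) else 0.05) for d in range(10)}
# Three encipherments of the same plain text "2468" under keys 0101, 1100 and 0011.
messages = ["2569", "3568", "2479"]
for guess in ("2468", "9999"):
    print(guess, round(score_assumption(messages, guess, 0, rough), 2))
```

The correct assumption yields key made up entirely of the favoured values in every message and therefore scores higher than the wrong one; more messages in the set sharpen the distinction.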

3. Additional Procedures

One case of this is the solution of a columnar
transposition. The cryptography consists of writing a
message, in the usual left to right manner, into a
rectangular box. Then the columns of letters are written
out, but the columns are chosen in a mixed order, not merely
from left to right. In cryptanalyzing, we take advantage of
our knowledge of language letter frequencies. We assume that
a certain stretch of message was a single column. Then we
try other stretches of the message as the column which was
originally to the right of the selected column at the time
the message was written into the rectangle. In the correct
case, the pairs of adjoining letters produced will have plain
text digraphic properties. Furthermore, since we know the
statistics of the language, we weight each pair by the probability
of occurrence and choose the correct match by the best
total score.

Then a third column is added, and our knowledge
of letter triples forms a basis for scoring, etc. Thus in a
system of this type the problem is one of storing probability
weights and data handling to make and score assumed matches
of stretches of text.
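
The column-matching step described above can be sketched in Python. The sketch makes several assumptions for illustration: digraph weights are logarithms of smoothed frequencies taken from a sample of language supplied by the caller, each candidate column is scored by adding the weights of the letter pairs it forms with the chosen left-hand column, and the function names are hypothetical. The tiny sample text in the usage lines is artificial and stands in for real language statistics.

```python
import math
from collections import Counter
from typing import Callable, Sequence


def digraph_log_weights(sample: str) -> Callable[[str], float]:
    """Build a weight function from a language sample (add-one smoothing over 26*26 pairs)."""
    letters = [c for c in sample.upper() if "A" <= c <= "Z"]
    counts = Counter(a + b for a, b in zip(letters, letters[1:]))
    total = sum(counts.values())

    def weight(pair: str) -> float:
        return math.log((counts[pair] + 1) / (total + 676))

    return weight


def score_match(left: str, right: str, weight: Callable[[str], float]) -> float:
    """Total weight of the digraphs formed row by row when `right` adjoins `left`."""
    return sum(weight(a + b) for a, b in zip(left, right))


def best_right_neighbor(columns: Sequence[str], left_index: int,
                        weight: Callable[[str], float]) -> int:
    """Index of the column that scores best as the right-hand neighbor of the chosen column."""
    others = [i for i in range(len(columns)) if i != left_index]
    return max(others, key=lambda i: score_match(columns[left_index], columns[i], weight))


weights = digraph_log_weights("THTHTHTH")   # artificial sample in which TH is common
print(best_right_neighbor(["TTT", "XXX", "HHH"], 0, weights))
```

Extending the score to trigraphs when a third column is added follows the same pattern with a table of letter triples.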

A second example is the case where it is known
that additive key was not furnished to the encipherer as
a one-time tape or reusable tape or pad, but was generated
by a known complex manual process. Some specific variable
or variables of the process are not known. Here it is
necessary to duplicate the steps followed by the enciphering
clerk, making assumptions about the unknown variables.
Each assumption results in "trial" additive which can be
applied to the message text. The result, called "pseudo-plain",
can be examined to see if it is plain text, or has
the statistical properties of plain text.

V. Support Functions

These functions embrace a number of different activities.
These activities are somewhat general to all cryptanalytic
work and are not based on any particular traffic or system.

A. Linguistic and Statistical Aids 

A number of special dictionaries and statistical 
studies based upon various languages of interest are re- 
quired as aids to cryptanalysts. The dictionaries and 
statistics may be based upon particular traffic decryptions 
or upon general samples of the language. Dictionaries arranged 
in special ordering (such as ordered alphabetically on last 
letter of the words) may be desired. Frequent revision of
dictionaries and statistical studies is required. Besides
linguistic studies, large numbers of special mathematical
and statistical tables are prepared. These have included
special Poisson, Binomial and Multinomial tables, and others.

B. Generation of Crypto-System Data

It is sometimes desired to provide the analyst with
listings of data pertaining to a particular cipher machine
system. Examples are tables which enable computation of
cycle distance between specific machine settings, and lists
of successive settings of a cipher machine. In certain hand
systems in which the combining of texts is done in several
steps, tables showing the end results of the several steps
for various variables may be desired.

C. Desk Aids 

There are a number of small devices which may be 
provided for the individual cryptanalyst to use directly 
along with his work. These include commercial adding ma- 
chines and desk calculators, individual cipher machines, 
analogues of portions of various crypto-processes, tallying 
counters and the like. Such devices may be useful, provided
that they are actually available to the person while
working.

D. Cryptanalytic Research

There are times, in the course of current cryptanalysis,
during which elements appear which are not subject
to regular periodic change, but are such that a
change in them would obviate current methods of attack.
Again there are times when information as to new cipher
machines or systems not in use, but offered for sale, becomes
available. In this sort of situation a substantial
effort to prepare to solve a problem which may never
materialize is justified.

E. Collateral 

Cryptanalysis is not performed in a vacuum. Often
what helps an attack on messages is some notion of what they
are likely to be about. Sometimes a name recovered from
one message gives clues as to the subject matter or
personalities probably referred to elsewhere in the message,
or in some related message. Sometimes direct references
to an unread message by date or serial number in the
messages that are being read help provide a wedge into
the referenced message.

Similarly, a "Who's Who" file may furnish footnotes
to a solved message which will increase the amount of information
furnished to an intelligence analyst.

Extensively cross-indexed files are kept in
operating sections and at higher echelons to furnish aids
of this sort. This type of information is called "collateral".

Here we have a large scale data handling problem
which includes filing, indexing, cross-indexing, looking up,
abstracting, etc.






